A GENERAL DIVERGENCE MEASURE FOR MONOTONIC 
FUNCTIONS AND APPLICATIONS IN INFORMATION THEORY 



S.S. DRAGOMIR 



Abstract. A general divergence measure for monotonic functions is introduced. Its connections with the $f$-divergence for convex functions are explored. The main properties are pointed out.



1. Introduction 

Let $(X, \mathcal{A})$ be a measurable space satisfying $|\mathcal{A}| > 2$ and $\mu$ be a $\sigma$-finite measure on $(X, \mathcal{A})$. Let $\mathcal{P}$ be the set of all probability measures on $(X, \mathcal{A})$ which are absolutely continuous with respect to $\mu$. For $P, Q \in \mathcal{P}$, let $p = \frac{dP}{d\mu}$ and $q = \frac{dQ}{d\mu}$ denote the Radon-Nikodym derivatives of $P$ and $Q$ with respect to $\mu$.

Two probability measures $P, Q \in \mathcal{P}$ are said to be orthogonal, and we denote this by $Q \perp P$, if

    $P(\{q = 0\}) = Q(\{p = 0\}) = 1.$

Let $f : [0,\infty) \to (-\infty,\infty]$ be a convex function that is continuous at 0, i.e., $f(0) = \lim_{u \downarrow 0} f(u)$.

In 1963, I. Csiszár [2] introduced the concept of $f$-divergence as follows.



Definition 1. Let $P, Q \in \mathcal{P}$. Then

(1.1)    $I_f(Q,P) = \int_X p(x)\, f\left[\frac{q(x)}{p(x)}\right] d\mu(x)$

is called the $f$-divergence of the probability distributions $Q$ and $P$.

We now give some examples of $f$-divergences that are well known and often used in the literature (see also

1.1. The Class of $\chi^\alpha$-Divergences. The $f$-divergences of this class, which is generated by the function $\chi^\alpha$, $\alpha \in [1,\infty)$, defined by

    $\chi^\alpha(u) = |u - 1|^\alpha, \quad u \in [0,\infty),$

have the form

(1.2)    $I_f(Q,P) = \int_X p \left|\frac{q}{p} - 1\right|^\alpha d\mu = \int_X p^{1-\alpha} |q - p|^\alpha \, d\mu.$

From this class only the parameter $\alpha = 1$ provides a distance in the topological sense, namely the total variation distance $V(Q,P) = \int_X |q - p| \, d\mu$. The most prominent special case of this class is, however, Karl Pearson's $\chi^2$-divergence.
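1.2. Dichotomy Class. From this class, generated by the function $f_\alpha : [0,\infty) \to \mathbb{R}$,

    $f_\alpha(u) = \begin{cases} u - 1 - \ln u & \text{for } \alpha = 0; \\ \frac{1}{\alpha(1-\alpha)} \left[\alpha u + 1 - \alpha - u^\alpha\right] & \text{for } \alpha \in \mathbb{R} \setminus \{0, 1\}; \\ 1 - u + u \ln u & \text{for } \alpha = 1, \end{cases}$

only the parameter $\alpha = \frac{1}{2}$ (for which $f_{1/2}(u) = 2(\sqrt{u} - 1)^2$) provides a distance, namely, the Hellinger distance

    $H(Q,P) = \left[\int_X (\sqrt{q} - \sqrt{p})^2 \, d\mu\right]^{1/2}.$

Another important divergence is the Kullback-Leibler divergence obtained for $\alpha = 1$,

    $KL(Q,P) = \int_X q \ln\left(\frac{q}{p}\right) d\mu.$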
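The examples above can be checked numerically in the simplest setting. The following is a minimal sketch, assuming that $X$ is a finite set, that $\mu$ is the counting measure, and that every $p(x) > 0$; the function names and the sample densities are hypothetical and chosen only for illustration.

```python
import math

def f_divergence(q, p, f):
    """Return I_f(Q, P) = sum_x p(x) * f(q(x) / p(x)) for the counting measure.

    Assumption of this sketch: X is finite and every p(x) > 0.
    """
    return sum(pi * f(qi / pi) for qi, pi in zip(q, p))

def chi_alpha(alpha):
    """Generator u -> |u - 1|**alpha of the chi^alpha class."""
    return lambda u: abs(u - 1.0) ** alpha

def dichotomy_one(u):
    """Generator f_1(u) = 1 - u + u ln u of the dichotomy class (alpha = 1)."""
    return 1.0 - u + u * math.log(u)

# Hypothetical densities p, q on a three-point set.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]

total_variation = f_divergence(q, p, chi_alpha(1))     # V(Q, P)
pearson_chi2 = f_divergence(q, p, chi_alpha(2))        # chi^2-divergence
kullback_leibler = f_divergence(q, p, dichotomy_one)   # KL(Q, P)
```

Since $\int_X p \, (1 - q/p) \, d\mu = 0$, the generator $f_1$ reproduces $\int_X q \ln(q/p) \, d\mu$ exactly, which is why `kullback_leibler` agrees with the direct formula.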



1.3. Matsushita's Divergences. The elements of this class, which is generated by the function $\varphi_\alpha$, $\alpha \in (0, 1]$, given by

    $\varphi_\alpha(u) := |1 - u^\alpha|^{1/\alpha}, \quad u \in [0,\infty),$

are prototypes of metric divergences, providing the distances $[I_{\varphi_\alpha}(Q,P)]^\alpha$.

1.4. Puri-Vincze Divergences. This class is generated by the functions $\Phi_\alpha$, $\alpha \in [1,\infty)$, given by

    $\Phi_\alpha(u) := \frac{|1 - u|^\alpha}{(u + 1)^{\alpha - 1}}, \quad u \in [0,\infty).$

It has been shown in [8] that this class provides the distances $[I_{\Phi_\alpha}(Q,P)]^{1/\alpha}$.

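1.5. Divergences of Arimoto-type. This class is generated by the functions

    $\Psi_\alpha(u) := \begin{cases} \frac{\alpha}{\alpha - 1} \left[(1 + u^\alpha)^{1/\alpha} - 2^{\frac{1}{\alpha} - 1}(1 + u)\right] & \text{for } \alpha \in (0,\infty) \setminus \{1\}; \\ (1 + u) \ln 2 + u \ln u - (1 + u) \ln(1 + u) & \text{for } \alpha = 1; \\ \frac{1}{2}|1 - u| & \text{for } \alpha = \infty. \end{cases}$

It has been shown in [8] that this class provides the distances $[I_{\Psi_\alpha}(Q,P)]^{\min(\alpha, 1/\alpha)}$ for $\alpha \in (0,\infty)$ and $\frac{1}{2} V(Q,P)$ for $\alpha = \infty$.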

2. Some Classes of Normalised Functions 

We denote by $\mathcal{M}^{\nearrow}([0,\infty))$ the class of monotonic nondecreasing functions defined on $[0,\infty)$ and by $\mathcal{M}s([0,\infty))$ the class of measurable functions on $[0,\infty)$. We also consider $\mathcal{L}e_1([0,\infty))$, the class of measurable functions $g : [0,\infty) \to \mathbb{R}$ with the property that

(2.1)    $g(t) \le g(1) \le g(s)$ for $0 \le t \le 1 \le s < \infty$.

It is obvious that

(2.2)    $\mathcal{M}^{\nearrow}([0,\infty)) \subset \mathcal{L}e_1([0,\infty)),$

and the inclusion (2.2) is strict.

We say that a function $f : [0,\infty) \to \mathbb{R}$ is normalised if $f(1) = 0$. We denote by $\mathcal{M}s_0([0,\infty))$ the class of all normalised measurable functions defined on $[0,\infty)$. We also need the following classes of functions:

    $\mathcal{C}_0([0,\infty)) := \{f \in \mathcal{M}s_0([0,\infty)) \mid f \text{ is continuous convex on } [0,\infty)\};$

    $\mathcal{D}_0([0,\infty)) := \{f \in \mathcal{M}s_0([0,\infty)) \mid f(t) = (t - 1) g(t), \ \forall t \in [0,\infty), \ g \in \mathcal{M}^{\nearrow}([0,\infty))\}$

and

    $\mathcal{O}_0([0,\infty)) := \{f \in \mathcal{M}s_0([0,\infty)) \mid f(t) = (t - 1) g(t), \ \forall t \in [0,\infty), \ g \in \mathcal{L}e_1([0,\infty))\}.$

From the definition of $\mathcal{D}_0([0,\infty))$ and $\mathcal{O}_0([0,\infty))$, and taking into account that the strict inclusion (2.2) holds, we deduce that

(2.3)    $\mathcal{D}_0([0,\infty)) \subset \mathcal{O}_0([0,\infty)),$

and the inclusion is strict.

For the other two classes, we may state the following result. 

Lemma 1. We have the strict inclusion

(2.4)    $\mathcal{C}_0([0,\infty)) \subset \mathcal{D}_0([0,\infty)).$

Proof. We will show that any continuous convex function $f : [0,\infty) \to \mathbb{R}$ that is normalised may be represented as

(2.5)    $f(t) = (t - 1) g(t)$ for any $t \in [0,\infty)$,

where $g \in \mathcal{M}^{\nearrow}([0,\infty))$.

Now, let $f \in \mathcal{C}_0([0,\infty))$. For $\lambda \in [D_-f(1), D_+f(1)]$, define

    $g_\lambda(t) := \begin{cases} \frac{f(t)}{t - 1} & \text{if } t \in [0,1) \cup (1,\infty), \\ \lambda & \text{if } t = 1. \end{cases}$

We use the following well known result [1, p. 111]: If $\Psi$ is convex on $(a, b)$ and $a < s < t < u < b$, then

(2.6)    $\Psi(s,t) \le \Psi(s,u) \le \Psi(t,u),$

where

    $\Psi(s,t) = \frac{\Psi(t) - \Psi(s)}{t - s}.$

If $\Psi$ is strictly convex on $(a, b)$, equality will not occur in (2.6).

If we apply the above result for $0 \le s < t < 1$, then we can state

    $\frac{f(s)}{s - 1} \le \frac{f(t)}{t - 1}.$

Taking the limit over $t \to 1$, $t < 1$, we deduce

    $\frac{f(s)}{s - 1} \le D_-f(1),$

showing that for $0 \le t < 1$, we have $g_\lambda(t) \le \lambda$.

Similarly, we may prove that for $1 < t < \infty$, $g_\lambda(t) \ge \lambda$. If we use the same result for $0 \le t_1 < t_2 < 1$, then we may write

    $\frac{f(t_1)}{t_1 - 1} \le \frac{f(t_2)}{t_2 - 1},$

which gives $g_\lambda(t_1) \le g_\lambda(t_2)$ for $0 \le t_1 < t_2 < 1$.

In a similar fashion we can prove that for $1 < t_1 < t_2 < \infty$, $g_\lambda(t_1) \le g_\lambda(t_2)$, and thus we may conclude that the function $g_\lambda$ is monotonic nondecreasing on the whole interval $[0,\infty)$.

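If we consider now the function $f(t) = (t - 1) e^{4t}$, $t \in [0,\infty)$, we observe that $f'(t) = (4t - 3) e^{4t}$ and $f''(t) = 8 e^{4t} (2t - 1)$, so that $f''(t) < 0$ for $t \in [0, \frac{1}{2})$, which shows that $f$ is not convex on $[0,\infty)$. Obviously, $f \in \mathcal{D}_0([0,\infty))$, since $g(t) = e^{4t}$ is nondecreasing, and thus the inclusion (2.4) is indeed strict. □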
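The counterexample at the end of the proof can be verified with a simple midpoint test. The sketch below is only an illustration; the helper names are hypothetical, and the interval $[0, \frac{1}{2}]$ is chosen because the second derivative is negative there.

```python
import math

def f(t):
    """Counterexample from the proof: f(t) = (t - 1) * exp(4 t)."""
    return (t - 1.0) * math.exp(4.0 * t)

def g(t):
    """Monotonic nondecreasing factor g(t) = exp(4 t), so f(t) = (t - 1) g(t)."""
    return math.exp(4.0 * t)

def second_derivative(t):
    """Closed form f''(t) = 8 exp(4 t) (2 t - 1) stated in the proof."""
    return 8.0 * math.exp(4.0 * t) * (2.0 * t - 1.0)

# Midpoint test on [0, 1/2]: convexity would require
# f(1/4) <= (f(0) + f(1/2)) / 2, which fails here.
midpoint_value = f(0.25)
chord_value = 0.5 * (f(0.0) + f(0.5))
violates_convexity = midpoint_value > chord_value
```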

Remark 1. If $f \in \mathcal{D}_0([0,\infty))$ and $g_1, g_2 \in \mathcal{M}^{\nearrow}([0,\infty))$ are two functions with

    $f(t) = (t - 1) g_1(t), \quad f(t) = (t - 1) g_2(t)$

for each $t \in [0,\infty)$, then we get

    $(t - 1) [g_1(t) - g_2(t)] = 0$

for any $t \in [0,\infty)$, showing that $g_1(t) = g_2(t)$ for each $t \in [0,1) \cup (1,\infty)$. They may have different values at $t = 1$.

3. Some Fundamental Properties of $f$-Divergence for $f \in \mathcal{C}_0([0,\infty))$

For $f \in \mathcal{C}_0([0,\infty))$ we obtain the $*$-conjugate function of $f$ by

    $f^*(u) = u f\left(\frac{1}{u}\right), \quad u \in (0,\infty).$

It is also known that if $f \in \mathcal{C}_0([0,\infty))$, then $f^* \in \mathcal{C}_0([0,\infty))$.

The following two theorems contain the most basic properties of $f$-divergences.
For their proof we refer the reader to Chapter 1 of [Hj (see also [j^)- 

Theorem 1 (Uniqueness and Symmetry Theorem). Let $f, f_1$ be continuous convex on $[0,\infty)$.

(i) We have

    $I_{f_1}(Q,P) = I_f(Q,P),$

for any $P, Q \in \mathcal{P}$ if and only if there exists a constant $c \in \mathbb{R}$ such that

    $f_1(u) = f(u) + c(u - 1),$

for any $u \in [0,\infty)$;

(ii) We have

    $I_{f^*}(Q,P) = I_f(Q,P),$

for any $P, Q \in \mathcal{P}$ if and only if there exists a constant $d \in \mathbb{R}$ such that

    $f^*(u) = f(u) + d(u - 1),$

for any $u \in [0,\infty)$.

Theorem 2 (Range of Values Theorem). Let $f : [0,\infty) \to \mathbb{R}$ be a continuous convex function on $[0,\infty)$.

For any $P, Q \in \mathcal{P}$, we have the double inequality

(3.1)    $f(1) \le I_f(Q,P) \le f(0) + f^*(0).$

(i) If $P = Q$, then the equality holds in the first part of (3.1).

If $f$ is strictly convex at 1, then the equality holds in the first part of (3.1) if and only if $P = Q$;

(ii) If $Q \perp P$, then the equality holds in the second part of (3.1).

If $f(0) + f^*(0) < \infty$, then equality holds in the second part of (3.1) if and only if $Q \perp P$.

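Define the function $\tilde{f} : (0,\infty) \to \mathbb{R}$, $\tilde{f}(u) = \frac{1}{2}(f(u) + f^*(u))$. The following result is a refinement of the second inequality in Theorem 2 (see [Theorem 3]).

Theorem 3. Let $f \in \mathcal{C}_0([0,\infty))$ with $f(0) + f^*(0) < \infty$. Then

(3.2)    $0 \le I_f(Q,P) \le \tilde{f}(0) V(Q,P)$

for any $Q, P \in \mathcal{P}$, where $\tilde{f}(0) = \frac{1}{2}(f(0) + f^*(0))$.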

4. A General Divergence Measure 

If $f : [0,\infty) \to \mathbb{R}$ is a general measurable function, then we may define the $f$-divergence in the same way, i.e., if $P, Q \in \mathcal{P}$, then

    $I_f(Q,P) = \int_X p(x)\, f\left[\frac{q(x)}{p(x)}\right] d\mu(x).$

For a measurable function $g : [0,\infty) \to \mathbb{R}$, we may also define the $\delta$-divergence by the formula

    $\delta_g(Q,P) = \int_X [q(x) - p(x)]\, g\left[\frac{q(x)}{p(x)}\right] d\mu(x).$

It is obvious that the $\delta$-divergence of a function $g$ may be seen as the $f$-divergence of the function $f$, where $f(t) = (t - 1) g(t)$ for $t \in [0,\infty)$.
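The relation between the two divergences can be illustrated on a finite set. This is a minimal sketch under the assumptions that $X$ is finite, $\mu$ is the counting measure and every $p(x) > 0$; the choice $g = \arctan$ and the sample densities are hypothetical.

```python
import math

def f_divergence(q, p, f):
    """I_f(Q, P) = sum_x p(x) f(q(x)/p(x)); assumes a finite X and p(x) > 0."""
    return sum(pi * f(qi / pi) for qi, pi in zip(q, p))

def delta_divergence(q, p, g):
    """delta_g(Q, P) = sum_x [q(x) - p(x)] g(q(x)/p(x)); same assumptions."""
    return sum((qi - pi) * g(qi / pi) for qi, pi in zip(q, p))

g = math.atan                         # illustrative measurable g (an assumption)

def f(t):
    """Associated generator f(t) = (t - 1) g(t)."""
    return (t - 1.0) * g(t)

p = [0.5, 0.3, 0.2]                   # hypothetical densities
q = [0.2, 0.5, 0.3]

delta_value = delta_divergence(q, p, g)
f_value = f_divergence(q, p, f)       # agrees with delta_value up to rounding
```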

If $f \in \mathcal{C}_0([0,\infty))$, then, since $f(t) = (t - 1) g_\lambda(t)$, $t \in [0,\infty)$, where

(4.1)    $g_\lambda(t) := \begin{cases} \frac{f(t)}{t - 1} & \text{if } t \in [0,1) \cup (1,\infty), \\ \lambda & \text{if } t = 1, \end{cases}$

and $\lambda \in [D_-f(1), D_+f(1)]$, we have

(4.2)    $I_f(Q,P) = \delta_{g_\lambda}(Q,P)$ for any $P, Q \in \mathcal{P}$,

i.e., the $f$-divergence for any normalised continuous convex function $f : [0,\infty) \to \mathbb{R}$ may be seen as the $\delta$-divergence of the function $g_\lambda$ defined by (4.1).

In what follows, we point out some fundamental properties of the $\delta$-divergence.

Theorem 4. Let $g : [0,\infty) \to \mathbb{R}$ be a measurable function on $[0,\infty)$ and $P, Q \in \mathcal{P}$. If there exist constants $m, M$ with

(4.3)    $-\infty < m \le g\left[\frac{q(x)}{p(x)}\right] \le M < \infty$

for $\mu$-a.e. $x \in X$, then we have the inequality

(4.4)    $|\delta_g(Q,P)| \le \frac{1}{2}(M - m) V(Q,P).$

Proof. Since $\int_X [q(x) - p(x)] \, d\mu(x) = 0$, the following identity holds true:

(4.5)    $\delta_g(Q,P) = \int_X [q(x) - p(x)] \left(g\left[\frac{q(x)}{p(x)}\right] - \frac{m + M}{2}\right) d\mu(x).$

By (4.3), we deduce that

    $\left|g\left[\frac{q(x)}{p(x)}\right] - \frac{m + M}{2}\right| \le \frac{1}{2}(M - m)$

for $\mu$-a.e. $x \in X$.

Taking the modulus in (4.5) we deduce

    $|\delta_g(Q,P)| \le \int_X |q(x) - p(x)| \left|g\left[\frac{q(x)}{p(x)}\right] - \frac{m + M}{2}\right| d\mu(x) \le \frac{1}{2}(M - m) \int_X |q(x) - p(x)| \, d\mu(x) = \frac{1}{2}(M - m) V(Q,P)$

and the inequality (4.4) is proved. □
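The bound of Theorem 4 can be checked on a small example. The sketch below assumes a finite $X$ with the counting measure and $p(x) > 0$; the function $g = \arctan$, the densities, and all names are illustrative assumptions, with $m$ and $M$ taken as the smallest and largest values of $g$ on the observed ratios.

```python
import math

def delta_divergence(q, p, g):
    """delta_g(Q, P) = sum_x [q(x) - p(x)] g(q(x)/p(x)); finite X, p(x) > 0."""
    return sum((qi - pi) * g(qi / pi) for qi, pi in zip(q, p))

def total_variation(q, p):
    """V(Q, P) = sum_x |q(x) - p(x)|."""
    return sum(abs(qi - pi) for qi, pi in zip(q, p))

p = [0.5, 0.3, 0.2]                   # hypothetical densities
q = [0.2, 0.5, 0.3]
g = math.atan                         # illustrative bounded g (an assumption)

# Bounds m <= g(q/p) <= M on the ratios that actually occur, as in (4.3).
values = [g(qi / pi) for qi, pi in zip(q, p)]
m, M = min(values), max(values)

lhs = abs(delta_divergence(q, p, g))              # |delta_g(Q, P)|
rhs = 0.5 * (M - m) * total_variation(q, p)       # (1/2)(M - m) V(Q, P)
```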

The following corollary is a natural consequence of the above theorem. 
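Corollary 1. Let $g : [0,\infty) \to \mathbb{R}$ be a measurable function on $[0,\infty)$. If

    $m := \operatorname{ess\,inf}_{t \in [0,\infty)} g(t) > -\infty, \quad M := \operatorname{ess\,sup}_{t \in [0,\infty)} g(t) < \infty,$

then for any $P, Q \in \mathcal{P}$, we have the inequality

(4.6)    $|\delta_g(Q,P)| \le \frac{1}{2}(M - m) V(Q,P).$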



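Remark 2. We know that, if $f : [0,\infty) \to \mathbb{R}$ is a normalised continuous convex function and if the limit $\lim_{t \downarrow 0} f^*(t) = \lim_{u \downarrow 0} \left[u f\left(\frac{1}{u}\right)\right] =: f^*(0)$ exists and is finite, then we have the inequality [Theorem 2.3]

(4.7)    $I_f(Q,P) \le \frac{f(0) + f^*(0)}{2} V(Q,P),$

for any $P, Q \in \mathcal{P}$. We can prove this inequality by the use of Corollary 1 as follows. We have

    $I_f(Q,P) = \delta_{g_\lambda}(Q,P),$

where

    $g_\lambda(t) := \begin{cases} \frac{f(t)}{t - 1} & \text{if } t \in [0,1) \cup (1,\infty), \\ \lambda & \text{if } t = 1, \end{cases}$

$\lambda \in [D_-f(1), D_+f(1)]$ and $g_\lambda \in \mathcal{M}^{\nearrow}([0,\infty))$. We observe that for any $t \in [0,\infty)$, we have

    $g_\lambda(t) \ge \lim_{t \to 0+} g_\lambda(t) = -f(0) = m > -\infty$

and

    $g_\lambda(t) \le \lim_{t \to \infty} g_\lambda(t) = \lim_{t \to \infty} \frac{f(t)}{t - 1} = \lim_{u \to 0+} \frac{f\left(\frac{1}{u}\right)}{\frac{1}{u} - 1} = \lim_{u \to 0+} \frac{u f\left(\frac{1}{u}\right)}{1 - u} = f^*(0) = M < \infty.$

Applying Corollary 1 for $m = -f(0)$ and $M = f^*(0)$, we deduce the desired inequality (4.7).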
The following result also holds.

Theorem 5. Let $g : [0,\infty) \to \mathbb{R}$ be a measurable function on $[0,\infty)$ and $P, Q \in \mathcal{P}$. If there exists a constant $K$ with $K > 0$ such that

(4.8)    $\left|g\left[\frac{q(x)}{p(x)}\right] - g(1)\right| \le K \left|\frac{q(x)}{p(x)} - 1\right|^\alpha$

for $\mu$-a.e. $x \in X$, where $\alpha \in (0,\infty)$ is a given number, then we have the inequality

(4.9)    $|\delta_g(Q,P)| \le K I_{\chi^{\alpha+1}}(Q,P).$

Proof. We observe that the following identity holds true:

(4.10)    $\delta_g(Q,P) = \int_X [q(x) - p(x)] \left(g\left[\frac{q(x)}{p(x)}\right] - g(1)\right) d\mu(x).$

Taking the modulus in (4.10) and using the condition (4.8), we have successively

    $|\delta_g(Q,P)| \le \int_X |q(x) - p(x)| \left|g\left[\frac{q(x)}{p(x)}\right] - g(1)\right| d\mu(x) \le K \int_X [p(x)]^{-\alpha} |q(x) - p(x)|^{\alpha+1} \, d\mu(x) \le K I_{\chi^{\alpha+1}}(Q,P)$

and the inequality (4.9) is obtained. □

The following corollary holds. 

Corollary 2. Let $g : [0,\infty) \to \mathbb{R}$ be a measurable function on $[0,\infty)$ for which there exists a constant $K$ such that

(4.11)    $|g(t) - g(1)| \le K |t - 1|^\alpha,$

for a.e. $t \in [0,\infty)$, where $\alpha > 0$ is a given number. Then for any $P, Q \in \mathcal{P}$, we have the inequality

(4.12)    $|\delta_g(Q,P)| \le K I_{\chi^{\alpha+1}}(Q,P).$

Remark 3. If the function $g : [0,\infty) \to \mathbb{R}$ is Hölder continuous with a constant $H > 0$ and $\beta \in (0, 1]$, i.e.,

    $|g(t) - g(s)| \le H |t - s|^\beta,$

for any $t, s \in [0,\infty)$, then obviously (4.11) holds with $K = H$ and $\alpha = \beta$.

If $g : [0,\infty) \to \mathbb{R}$ is Lipschitzian with the constant $L > 0$, i.e.,

    $|g(t) - g(s)| \le L |t - s|,$

for any $t, s \in [0,\infty)$, then

(4.13)    $|\delta_g(Q,P)| \le L I_{\chi^2}(Q,P),$

for any $P, Q \in \mathcal{P}$.

Finally, if $g$ is locally absolutely continuous and the derivative $g' : [0,\infty) \to \mathbb{R}$ is essentially bounded, i.e., $\|g'\|_{[0,\infty),\infty} := \operatorname{ess\,sup}_{t \in [0,\infty)} |g'(t)| < \infty$, then we have the inequality

(4.14)    $|\delta_g(Q,P)| \le \|g'\|_{[0,\infty),\infty} I_{\chi^2}(Q,P),$

for any $P, Q \in \mathcal{P}$.

The following result concerning $f$-divergences for convex functions $f$ holds.

Theorem 6. Let $f : [0,\infty) \to \mathbb{R}$ be a continuous convex function on $[0,\infty)$. If $\lambda \in [D_-f(1), D_+f(1)]$ ($\lambda = f'(1)$ if $f$ is differentiable at $t = 1$), and there exist constants $K > 0$ and $\alpha > 0$ such that

(4.15)    $|f(t) - \lambda(t - 1)| \le K |t - 1|^{\alpha+1},$

for any $t \in [0,\infty)$, then we have the inequality

(4.16)    $0 \le I_f(Q,P) \le K I_{\chi^{\alpha+1}}(Q,P),$

for any $P, Q \in \mathcal{P}$.

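Proof. We have

    $I_f(Q,P) = \int_X [q(x) - p(x)]\, g_\lambda\left[\frac{q(x)}{p(x)}\right] d\mu(x) = \delta_{g_\lambda}(Q,P),$

where

    $g_\lambda(t) := \begin{cases} \frac{f(t)}{t - 1} & \text{if } t \in [0,1) \cup (1,\infty), \\ \lambda & \text{if } t = 1, \end{cases}$

and $\lambda \in [D_-f(1), D_+f(1)]$.

Dividing (4.15) by $|t - 1|$ for $t \neq 1$ gives $|g_\lambda(t) - g_\lambda(1)| \le K |t - 1|^\alpha$. Applying Corollary 2 for $g_\lambda$, we deduce the desired result. □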

5. The Positivity of $\delta$-Divergence for $g \in \mathcal{M}^{\nearrow}([0,\infty))$

The following result holds.

Theorem 7. If $g \in \mathcal{M}^{\nearrow}([0,\infty))$, then $\delta_g(Q,P) \ge 0$ for any $P, Q \in \mathcal{P}$.
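Proof. We use the identity

(5.1)    $\delta_g(Q,P) = \int_X [q(x) - p(x)]\, g\left[\frac{q(x)}{p(x)}\right] d\mu(x)$

    $= \int_X p(x) \left[\frac{q(x)}{p(x)} - 1\right] g\left[\frac{q(x)}{p(x)}\right] d\mu(x)$

    $= \frac{1}{2} \int_X \int_X p(x) p(y) \left[\frac{q(x)}{p(x)} - \frac{q(y)}{p(y)}\right] \left(g\left[\frac{q(x)}{p(x)}\right] - g\left[\frac{q(y)}{p(y)}\right]\right) d\mu(x) \, d\mu(y).$

Since $g \in \mathcal{M}^{\nearrow}([0,\infty))$, then for any $t, s \in [0,\infty)$, we have

    $(t - s)(g(t) - g(s)) \ge 0,$

giving that

    $\left[\frac{q(x)}{p(x)} - \frac{q(y)}{p(y)}\right] \left(g\left[\frac{q(x)}{p(x)}\right] - g\left[\frac{q(y)}{p(y)}\right]\right) \ge 0$

for any $x, y \in X$.

Using the representation (5.1), we deduce the desired result. □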
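The double-integral representation (5.1) and the resulting positivity can be checked on a finite set. This is a sketch under the assumptions of a finite $X$ with the counting measure and $p(x) > 0$; the nondecreasing function $g = \arctan$ and the densities are hypothetical choices.

```python
import math

def delta_divergence(q, p, g):
    """delta_g(Q, P) = sum_x [q(x) - p(x)] g(q(x)/p(x)); finite X, p(x) > 0."""
    return sum((qi - pi) * g(qi / pi) for qi, pi in zip(q, p))

def symmetric_double_sum(q, p, g):
    """Right-hand side of (5.1): half the double sum over all pairs (x, y)."""
    ratios = [qi / pi for qi, pi in zip(q, p)]
    total = 0.0
    for px, ax in zip(p, ratios):
        for py, ay in zip(p, ratios):
            # Each term is nonnegative when g is nondecreasing.
            total += px * py * (ax - ay) * (g(ax) - g(ay))
    return 0.5 * total

p = [0.5, 0.3, 0.2]                   # hypothetical densities
q = [0.2, 0.5, 0.3]
g = math.atan                         # illustrative nondecreasing g

single = delta_divergence(q, p, g)
double = symmetric_double_sum(q, p, g)
```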

The following corollary is a natural consequence of the above result. 
Corollary 3. If $f \in \mathcal{D}_0([0,\infty))$, then $I_f(Q,P) \ge 0$ for any $P, Q \in \mathcal{P}$.

Proof. If $f \in \mathcal{D}_0([0,\infty))$, then there exists a $g \in \mathcal{M}^{\nearrow}([0,\infty))$ such that $f(t) = (t - 1) g(t)$ for any $t \in [0,\infty)$. Then

    $I_f(Q,P) = \int_X p(x)\, f\left[\frac{q(x)}{p(x)}\right] d\mu(x) = \int_X p(x) \left[\frac{q(x)}{p(x)} - 1\right] g\left[\frac{q(x)}{p(x)}\right] d\mu(x) = \delta_g(Q,P) \ge 0,$

and the proof is completed. □

In fact, the following improvement of Theorem 7 holds.

Theorem 8. If $g \in \mathcal{M}^{\nearrow}([0,\infty))$, then

(5.2)    $\delta_g(Q,P) \ge |\delta_{|g|}(Q,P)| \ge 0,$

for any $P, Q \in \mathcal{P}$.

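Proof. Since $g$ is monotonic nondecreasing, we have

(5.3)    $\left[\frac{q(x)}{p(x)} - \frac{q(y)}{p(y)}\right] \left(g\left[\frac{q(x)}{p(x)}\right] - g\left[\frac{q(y)}{p(y)}\right]\right) = \left|\frac{q(x)}{p(x)} - \frac{q(y)}{p(y)}\right| \left|g\left[\frac{q(x)}{p(x)}\right] - g\left[\frac{q(y)}{p(y)}\right]\right|$

    $\ge \left|\left[\frac{q(x)}{p(x)} - \frac{q(y)}{p(y)}\right] \left(\left|g\left[\frac{q(x)}{p(x)}\right]\right| - \left|g\left[\frac{q(y)}{p(y)}\right]\right|\right)\right|$

for any $x, y \in X$.

Multiplying (5.3) by $p(x) p(y) \ge 0$ and integrating on $X^2$, we deduce

    $\int_X \int_X p(x) p(y) \left[\frac{q(x)}{p(x)} - \frac{q(y)}{p(y)}\right] \left(g\left[\frac{q(x)}{p(x)}\right] - g\left[\frac{q(y)}{p(y)}\right]\right) d\mu(x) \, d\mu(y)$

    $\ge \left|\int_X \int_X p(x) p(y) \left[\frac{q(x)}{p(x)} - \frac{q(y)}{p(y)}\right] \left(\left|g\left[\frac{q(x)}{p(x)}\right]\right| - \left|g\left[\frac{q(y)}{p(y)}\right]\right|\right) d\mu(x) \, d\mu(y)\right|.$

Using the representation (5.1) and the same identity for $|g|$, we deduce the desired inequality (5.2). □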

Before we point out other possible refinements for the positivity inequality $\delta_g(Q,P) \ge 0$, where $g \in \mathcal{M}^{\nearrow}([0,\infty))$, we need the following divergence measure as well:

    $\bar{\delta}_h(Q,P) = \int_X |q(x) - p(x)|\, h\left[\frac{q(x)}{p(x)}\right] d\mu(x),$

which will be called the absolute $\delta$-divergence generated by the function $h : [0,\infty) \to \mathbb{R}$ that is assumed to be measurable on $[0,\infty)$.

The following result holds.

Theorem 9. If $g \in \mathcal{M}^{\nearrow}([0,\infty))$, then

(5.4)    $\delta_g(Q,P) \ge \max\left\{\left|\bar{\delta}_g(Q,P) - V(Q,P) I_g(Q,P)\right|, \left|\bar{\delta}_{|g|}(Q,P) - V(Q,P) I_{|g|}(Q,P)\right|\right\} \ge 0,$

for any $P, Q \in \mathcal{P}$.
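Proof. Since $g$ is monotonic, we have

(5.5)    $\left[\frac{q(x)}{p(x)} - \frac{q(y)}{p(y)}\right] \left(g\left[\frac{q(x)}{p(x)}\right] - g\left[\frac{q(y)}{p(y)}\right]\right) = \left|\left(\frac{q(x)}{p(x)} - 1\right) - \left(\frac{q(y)}{p(y)} - 1\right)\right| \left|g\left[\frac{q(x)}{p(x)}\right] - g\left[\frac{q(y)}{p(y)}\right]\right|$

    $\ge \max\left\{\left|\left(\left|\frac{q(x)}{p(x)} - 1\right| - \left|\frac{q(y)}{p(y)} - 1\right|\right) \left(g\left[\frac{q(x)}{p(x)}\right] - g\left[\frac{q(y)}{p(y)}\right]\right)\right|, \left|\left(\left|\frac{q(x)}{p(x)} - 1\right| - \left|\frac{q(y)}{p(y)} - 1\right|\right) \left(\left|g\left[\frac{q(x)}{p(x)}\right]\right| - \left|g\left[\frac{q(y)}{p(y)}\right]\right|\right)\right|\right\}$

for any $x, y \in X$.

If we multiply (5.5) by $p(x) p(y) \ge 0$ and integrate, we deduce

(5.6)    $\int_X \int_X p(x) p(y) \left[\frac{q(x)}{p(x)} - \frac{q(y)}{p(y)}\right] \left(g\left[\frac{q(x)}{p(x)}\right] - g\left[\frac{q(y)}{p(y)}\right]\right) d\mu(x) \, d\mu(y)$

    $\ge \max\left\{\left|\int_X \int_X p(x) p(y) \left(\left|\frac{q(x)}{p(x)} - 1\right| - \left|\frac{q(y)}{p(y)} - 1\right|\right) \left(g\left[\frac{q(x)}{p(x)}\right] - g\left[\frac{q(y)}{p(y)}\right]\right) d\mu(x) \, d\mu(y)\right|,\right.$

    $\left.\left|\int_X \int_X p(x) p(y) \left(\left|\frac{q(x)}{p(x)} - 1\right| - \left|\frac{q(y)}{p(y)} - 1\right|\right) \left(\left|g\left[\frac{q(x)}{p(x)}\right]\right| - \left|g\left[\frac{q(y)}{p(y)}\right]\right|\right) d\mu(x) \, d\mu(y)\right|\right\}.$

Now, observe that

    $\int_X \int_X p(x) p(y) \left(\left|\frac{q(x)}{p(x)} - 1\right| - \left|\frac{q(y)}{p(y)} - 1\right|\right) \left(g\left[\frac{q(x)}{p(x)}\right] - g\left[\frac{q(y)}{p(y)}\right]\right) d\mu(x) \, d\mu(y)$

    $= 2 \left[\int_X p(y) \, d\mu(y) \int_X p(x) \left|\frac{q(x)}{p(x)} - 1\right| g\left[\frac{q(x)}{p(x)}\right] d\mu(x) - \int_X p(x) \left|\frac{q(x)}{p(x)} - 1\right| d\mu(x) \int_X p(y)\, g\left[\frac{q(y)}{p(y)}\right] d\mu(y)\right]$

    $= 2 \left[\bar{\delta}_g(Q,P) - V(Q,P) I_g(Q,P)\right],$

and a similar identity holds for the quantity in the second branch of (5.6).

Finally, using the representation (5.1), we deduce the desired inequality (5.4). □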



6. The Positivity of $\delta$-Divergence for $g \in \mathcal{L}e_1([0,\infty))$

The following result, extending the positivity of the $\delta$-divergence for monotonic functions, holds.

Theorem 10. If $g \in \mathcal{L}e_1([0,\infty))$, then $\delta_g(Q,P) \ge 0$ for any $P, Q \in \mathcal{P}$.
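Proof. We use the identity

(6.1)    $\delta_g(Q,P) = \int_X [q(x) - p(x)]\, g\left[\frac{q(x)}{p(x)}\right] d\mu(x)$

    $= \int_X p(x) \left[\frac{q(x)}{p(x)} - 1\right] g\left[\frac{q(x)}{p(x)}\right] d\mu(x)$

    $= \int_X p(x) \left[\frac{q(x)}{p(x)} - 1\right] \left(g\left[\frac{q(x)}{p(x)}\right] - g(1)\right) d\mu(x).$

Since $g \in \mathcal{L}e_1([0,\infty))$, then for any $t \in [0,\infty)$ we have

    $(t - 1)[g(t) - g(1)] \ge 0,$

giving that

    $\left[\frac{q(x)}{p(x)} - 1\right] \left(g\left[\frac{q(x)}{p(x)}\right] - g(1)\right) \ge 0$

for any $x \in X$.

Using the representation (6.1), we deduce the desired result. □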
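Theorem 10 also covers functions that are not monotonic. The sketch below uses the hypothetical function $g(t) = \frac{t - 1}{1 + (t - 1)^2}$, which satisfies (2.1) with $g(1) = 0$ but decreases for $t > 2$; it assumes a finite $X$ with the counting measure and $p(x) > 0$, and the densities are chosen so that one ratio exceeds 2.

```python
def delta_divergence(q, p, g):
    """delta_g(Q, P) = sum_x [q(x) - p(x)] g(q(x)/p(x)); finite X, p(x) > 0."""
    return sum((qi - pi) * g(qi / pi) for qi, pi in zip(q, p))

def g(t):
    """Illustrative g with g(t) <= g(1) = 0 <= g(s) for t <= 1 <= s.

    It is not monotonic: it increases up to t = 2 and decreases afterwards.
    """
    return (t - 1.0) / (1.0 + (t - 1.0) ** 2)

p = [0.1, 0.6, 0.3]                   # hypothetical densities, ratios 5, 1/3, 1
q = [0.5, 0.2, 0.3]

delta_value = delta_divergence(q, p, g)   # nonnegative by Theorem 10
not_monotonic = g(2.0) > g(5.0)           # g decreases between t = 2 and t = 5
```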

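Corollary 4. If $f \in \mathcal{O}_0([0,\infty))$, then $I_f(Q,P) \ge 0$ for any $P, Q \in \mathcal{P}$.

Proof. If $f \in \mathcal{O}_0([0,\infty))$, then there exists a $g \in \mathcal{L}e_1([0,\infty))$ such that $f(t) = (t - 1) g(t)$ for any $t \in [0,\infty)$. Then

    $I_f(Q,P) = \int_X p(x) \left[\frac{q(x)}{p(x)} - 1\right] g\left[\frac{q(x)}{p(x)}\right] d\mu(x) = \delta_g(Q,P) \ge 0,$

and the proof is completed. □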

The following improvement of Theorem 10 holds.

Theorem 11. If $g \in \mathcal{L}e_1([0,\infty))$, then

(6.2)    $\delta_g(Q,P) \ge |\delta_{|g|}(Q,P)| \ge 0$

for any $P, Q \in \mathcal{P}$.

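Proof. Since $g \in \mathcal{L}e_1([0,\infty))$, we obviously have

(6.3)    $\left[\frac{q(x)}{p(x)} - 1\right] \left(g\left[\frac{q(x)}{p(x)}\right] - g(1)\right) = \left|\frac{q(x)}{p(x)} - 1\right| \left|g\left[\frac{q(x)}{p(x)}\right] - g(1)\right| \ge \left|\left[\frac{q(x)}{p(x)} - 1\right] \left(\left|g\left[\frac{q(x)}{p(x)}\right]\right| - |g(1)|\right)\right|$

for any $x \in X$.

Multiplying (6.3) by $p(x) \ge 0$ and integrating on $X$, we have

    $\int_X p(x) \left[\frac{q(x)}{p(x)} - 1\right] \left(g\left[\frac{q(x)}{p(x)}\right] - g(1)\right) d\mu(x) \ge \left|\int_X p(x) \left[\frac{q(x)}{p(x)} - 1\right] \left(\left|g\left[\frac{q(x)}{p(x)}\right]\right| - |g(1)|\right) d\mu(x)\right| = |\delta_{|g|}(Q,P)|$

and, since the left-hand side equals $\delta_g(Q,P)$ by (6.1), the inequality (6.2) is proved. □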

7. Bounds in Terms of the $\chi^2$-Divergence

The following result may be stated.

Theorem 12. Let $g : [0,\infty) \to \mathbb{R}$ be a differentiable function such that there exist constants $\gamma, \Gamma \in \mathbb{R}$ with

(7.1)    $\gamma \le g'(t) \le \Gamma$ for any $t \in (0,\infty)$.

Then we have the inequality

(7.2)    $\gamma D_{\chi^2}(Q,P) \le \delta_g(Q,P) \le \Gamma D_{\chi^2}(Q,P),$

for any $P, Q \in \mathcal{P}$.

Proof. Consider the auxiliary function $h_\gamma : [0,\infty) \to \mathbb{R}$, $h_\gamma(t) := g(t) - \gamma(t - 1)$. Obviously, $h_\gamma$ is differentiable on $(0,\infty)$ and since, by (7.1),

    $h_\gamma'(t) = g'(t) - \gamma \ge 0,$

it follows that $h_\gamma$ is monotonic nondecreasing on $[0,\infty)$.

Applying Theorem 7, we deduce

    $\delta_{h_\gamma}(Q,P) \ge 0$ for any $P, Q \in \mathcal{P}$

and since

    $\delta_{h_\gamma}(Q,P) = \delta_{g - \gamma(\cdot - 1)}(Q,P) = \int_X [q(x) - p(x)] \left(g\left[\frac{q(x)}{p(x)}\right] - \gamma \left[\frac{q(x)}{p(x)} - 1\right]\right) d\mu(x) = \delta_g(Q,P) - \gamma D_{\chi^2}(Q,P),$

then the first inequality in (7.2) is proved.

The second inequality may be proven in a similar manner by using the auxiliary function $h_\Gamma : [0,\infty) \to \mathbb{R}$, $h_\Gamma(t) := \Gamma(t - 1) - g(t)$. □
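The two-sided bound (7.2) can be illustrated numerically. The sketch below assumes a finite $X$ with the counting measure and $p(x) > 0$, and uses the hypothetical function $g(t) = t + \frac{1}{2} \sin t$, whose derivative $1 + \frac{1}{2} \cos t$ lies in $[\frac{1}{2}, \frac{3}{2}]$, so that $\gamma = 0.5$ and $\Gamma = 1.5$ are admissible constants.

```python
import math

def delta_divergence(q, p, g):
    """delta_g(Q, P) = sum_x [q(x) - p(x)] g(q(x)/p(x)); finite X, p(x) > 0."""
    return sum((qi - pi) * g(qi / pi) for qi, pi in zip(q, p))

def chi_square(q, p):
    """D_{chi^2}(Q, P) = sum_x (q(x) - p(x))**2 / p(x)."""
    return sum((qi - pi) ** 2 / pi for qi, pi in zip(q, p))

def g(t):
    """Illustrative g with 0.5 <= g'(t) = 1 + 0.5 cos t <= 1.5."""
    return t + 0.5 * math.sin(t)

gamma_lower, gamma_upper = 0.5, 1.5   # constants gamma and Gamma in (7.1)

p = [0.5, 0.3, 0.2]                   # hypothetical densities
q = [0.2, 0.5, 0.3]

delta_value = delta_divergence(q, p, g)
lower_bound = gamma_lower * chi_square(q, p)
upper_bound = gamma_upper * chi_square(q, p)
```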

The following corollary is a natural application of the above theorem. 

Corollary 5. Let $f : [0,\infty) \to \mathbb{R}$ be a differentiable convex function on $(0,\infty)$ with $f(1) = 0$. If there exist constants $\gamma, \Gamma \in \mathbb{R}$ with the property that

(7.3)    $\gamma(t - 1)^2 + f(t) \le f'(t)(t - 1) \le f(t) + \Gamma(t - 1)^2$

for any $t \in (0,\infty)$, then we have the inequality

(7.4)    $\gamma D_{\chi^2}(Q,P) \le I_f(Q,P) \le \Gamma D_{\chi^2}(Q,P)$

for any $P, Q \in \mathcal{P}$.
Proof. We know that for any $P, Q \in \mathcal{P}$, we have (see for example (4.2)):

    $I_f(Q,P) = \delta_{g_{f'(1)}}(Q,P),$

where

    $g_{f'(1)}(t) := \begin{cases} \frac{f(t)}{t - 1} & \text{if } t \in [0,1) \cup (1,\infty), \\ f'(1) & \text{if } t = 1. \end{cases}$

We observe that, by the hypothesis of the corollary, $g_{f'(1)}$ is differentiable on $(0,\infty)$ and

    $g_{f'(1)}'(t) = \frac{f'(t)(t - 1) - f(t)}{(t - 1)^2}$

for any $t \in (0,1) \cup (1,\infty)$.

Using (7.3), we deduce that

    $\gamma \le g_{f'(1)}'(t) \le \Gamma$

for $t \in (0,\infty)$, and applying Theorem 12 above for $g = g_{f'(1)}$, we deduce the desired inequality (7.4). □

8. Bounds in Terms of the $J$-Divergence

We recall that the Jeffreys divergence (or $J$-divergence for short) is defined as

(8.1)    $J(Q,P) := \int_X [q(x) - p(x)] \ln\left[\frac{q(x)}{p(x)}\right] d\mu(x),$

where $P, Q \in \mathcal{P}$.

The following result holds. 



Theorem 13. Let $g : [0,\infty) \to \mathbb{R}$ be a differentiable function such that there exist constants $\phi, \Phi \in \mathbb{R}$ with

(8.2)    $\phi \le t g'(t) \le \Phi$ for any $t \in (0,\infty)$.

Then we have the inequality

(8.3)    $\phi J(Q,P) \le \delta_g(Q,P) \le \Phi J(Q,P),$

for any $P, Q \in \mathcal{P}$.

Proof. Consider the auxiliary function $h_\phi : [0,\infty) \to \mathbb{R}$, $h_\phi(t) := g(t) - \phi \ln t$. Obviously, $h_\phi$ is differentiable on $(0,\infty)$ and, by (8.2),

    $h_\phi'(t) = g'(t) - \frac{\phi}{t} = \frac{1}{t}[t g'(t) - \phi] \ge 0,$

for any $t \in (0,\infty)$, showing that the function is monotonic nondecreasing on $(0,\infty)$.

Applying Theorem 7, we deduce

    $\delta_{h_\phi}(Q,P) \ge 0$ for any $P, Q \in \mathcal{P}$

and since

    $\delta_{h_\phi}(Q,P) = \delta_{g - \phi \ln(\cdot)}(Q,P) = \int_X [q(x) - p(x)] \left(g\left[\frac{q(x)}{p(x)}\right] - \phi \ln\left[\frac{q(x)}{p(x)}\right]\right) d\mu(x) = \delta_g(Q,P) - \phi J(Q,P),$

then the first inequality in (8.3) is proved.

The second inequality may be proven in a similar manner by using the auxiliary function $h_\Phi : [0,\infty) \to \mathbb{R}$, $h_\Phi(t) := \Phi \ln t - g(t)$. □
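The bound (8.3) can be checked in the same finite setting. The sketch below assumes a finite $X$ with the counting measure and strictly positive $p(x)$ and $q(x)$; the function $g(t) = \ln(1 + t)$ is a hypothetical choice for which $t g'(t) = \frac{t}{1 + t} \in (0, 1)$, so $\phi = 0$ and $\Phi = 1$ are admissible constants.

```python
import math

def delta_divergence(q, p, g):
    """delta_g(Q, P) = sum_x [q(x) - p(x)] g(q(x)/p(x)); finite X, p(x) > 0."""
    return sum((qi - pi) * g(qi / pi) for qi, pi in zip(q, p))

def jeffreys(q, p):
    """J(Q, P) = sum_x [q(x) - p(x)] ln(q(x)/p(x)); needs q(x), p(x) > 0."""
    return sum((qi - pi) * math.log(qi / pi) for qi, pi in zip(q, p))

def g(t):
    """Illustrative g with 0 < t g'(t) = t / (1 + t) < 1 on (0, infinity)."""
    return math.log(1.0 + t)

phi_lower, phi_upper = 0.0, 1.0       # constants phi and Phi in (8.2)

p = [0.5, 0.3, 0.2]                   # hypothetical densities
q = [0.2, 0.5, 0.3]

delta_value = delta_divergence(q, p, g)
lower_bound = phi_lower * jeffreys(q, p)
upper_bound = phi_upper * jeffreys(q, p)
```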

The following corollary is a natural application of the above theorem. 

Corollary 6. Let $f : [0,\infty) \to \mathbb{R}$ be a differentiable convex function on $(0,\infty)$ with $f(1) = 0$. If there exist constants $\phi, \Phi \in \mathbb{R}$ with the property that

(8.4)    $\phi(t - 1)^2 + t f(t) \le t(t - 1) f'(t) \le t f(t) + \Phi(t - 1)^2$

for any $t \in (0,\infty)$, then we have the inequality

(8.5)    $\phi J(Q,P) \le I_f(Q,P) \le \Phi J(Q,P)$

for any $P, Q \in \mathcal{P}$.

The proof is similar to the one in Corollary 5 and we omit the details.